Add Pipeline CRD for Redpanda Connect pipeline management #1337
david-yu wants to merge 65 commits into
Conversation
This PR is stale because it has been open 5 days with no activity. Remove stale label or comment or this will be closed in 5 days.
andrewstucki left a comment:
Not sure if this is just a movement of connect pipeline reconcilers over from another repo, but I would definitely want to change a chunk of the design around how this reconciliation works to be more in line with the patterns this repo has before merging anything like this. Could we just add this as part of a roadmap rather than trying to generate it? It shouldn't take more than a day or two to implement properly once we actually pull it in. But as is, there are a number of issues I see immediately with this PR that need changing:
- we try to use SSA semantics whenever possible, so the `CreateOrPatch` and `Update` calls are out of place
- not a huge fan of swallowing the status `Update` errors on the reconcile calls, and it appears inconsistent -- sometimes it looks like we're returning the update error, sometimes swallowing it
- we generally try and externalize our sub-resource definitions to some sort of "render" package to avoid having to inline everything
- this should likely use the `kube.Ctl` synchronization primitives
- I'm assuming we'd probably want to run some of the secret stuff through cloud-secret materialization?
- would we want any of the configuration around Redpanda sources to somehow be pluggable with our clusterRef-style specification?
- this appears to not have created the RBAC policies in the proper place, as they need to be copied over to the helm chart itself
- the tests should actually test the reconciler; here they just do license validation
- I'd prefer to use some sort of enum/typed status information for the pipeline conditions, because what they are/do is basically undocumented right now
- at least one rendering test in the helm chart should test the enabling flag
- the CRD itself also needs to be added to the CRD installation subcommand in order for this to ever work
- for a new CRD type we should have at least one acceptance test that exercises the feature
Moving back to draft mode. Thanks for taking a look.
Introduces the Connect custom resource (shortName: rpcn) for managing Redpanda Connect pipelines via the Redpanda Operator. Each Connect CR declaratively specifies a pipeline configuration in YAML, and the controller reconciles the desired state by managing a Deployment and ConfigMap.

Enterprise license gating: the controller validates a Redpanda enterprise license (v1 format from common-go/license) on every reconciliation. The license must include the CONNECT product and be unexpired. The license is read from a Kubernetes Secret referenced by spec.licenseSecretRef.

Key components:
- CRD types: Connect, ConnectSpec, ConnectStatus in v1alpha2
- Controller: creates/patches ConfigMap + Deployment, updates status
- RBAC: ClusterRole permissions for connects, deployments, configmaps, secrets
- CRD manifest: cluster.redpanda.com_connects.yaml
- Gated behind --enable-connect flag (default: false)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update generated files to match what CI's controller-gen v0.20.1 and code generators produce:
- Move Connect deepcopy functions to correct alphabetical position (after Configurator, before ConnectorMonitoring)
- Regenerate CRD YAML with full OpenAPI schema from controller-gen
- Update crd-docs.adoc with Connect type documentation
- Add Connect deprecation test case
- Update RBAC role.yaml to match controller-gen output
- Add missing common-go/license go.sum entries in acceptance/ and gen/
- Fix whitespace in run.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fix TestCRDS by adding connects.cluster.redpanda.com to the expected CRD list and adding a Connect() helper function.

Add Cloud-compatible fields to ConnectSpec for smooth migration to Redpanda Cloud managed Connect:
- displayName: human-readable pipeline name
- description: pipeline description
- tags: key-value pairs for filtering/organization
- configFiles: additional config files mounted at /config

The controller now includes configFiles entries in the ConfigMap alongside connect.yaml, with a guard against key collision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add displayName, description, tags, and configFiles documentation to the ConnectSpec section of the generated CRD docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add scheduling fields to ConnectSpec for spreading pipeline pods
across availability zones:
- zones: list of AZs to constrain and spread pods across. When set,
the controller auto-generates a node affinity (restrict to listed
zones) and a topology spread constraint (even distribution with
maxSkew=1, ScheduleAnyway) using topology.kubernetes.io/zone.
- tolerations: standard k8s tolerations for tainted nodes
- nodeSelector: label-based node selection
- topologySpreadConstraints: explicit spread constraints that
override the auto-generated zone constraint when provided
Example usage:
spec:
zones: ["us-east-1a", "us-east-1b", "us-east-1c"]
replicas: 3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
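Under the semantics described above, the auto-generated scheduling stanza for that example might render roughly like this (a sketch; the labelSelector is a hypothetical placeholder for however the controller selects the pipeline's own pods):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["us-east-1a", "us-east-1b", "us-east-1c"]
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app.kubernetes.io/instance: example-pipeline   # hypothetical selector
```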
Update connects CRD YAML with full TopologySpreadConstraint schema instead of x-kubernetes-preserve-unknown-fields, expand toleration descriptions, fix field ordering (nodeSelector before paused), and update crd-docs.adoc descriptions to match Go struct comments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Connect controller is now enabled by default (--enable-connect=true). Users can disable it via the operator helm chart value:

helm install redpanda-operator ... --set connectController.enabled=false

Individual Connect pipeline CRs still require an enterprise license with the CONNECT product — enabling the controller alone does not grant enterprise functionality.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update README, template, schema, partial types, and golden files to include the new connectController chart value. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Make spec.licenseSecretRef optional on Connect CRs. When not set, the controller falls back to the operator-level enterprise license configured via enterprise.licenseSecretRef in the operator Helm chart values. This avoids requiring users to specify the license on every Connect pipeline CR. The operator-level license is passed via --license-file-path and mounted from the chart's enterprise.licenseSecretRef secret. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove spec.licenseSecretRef from Connect CRD entirely. License is now only configured at the operator level via enterprise.licenseSecretRef in the operator Helm chart values.
- Set connectController.enabled to false by default (opt-in).
- Simplify controller license validation to only read from the operator-level license file path.
- Add unit tests for license validation covering: no license configured, invalid file, expired license, open source license, V0 enterprise license with all products, V1 enterprise with/without CONNECT product, V1 trial license, and V1 expired enterprise license.
- Fix values.schema.json alphabetical ordering (connectController before crds).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Match the v2 Redpanda CRD convention (Budget.MaxUnavailable *int) rather than the v1 IntOrString pattern. Removes MinAvailable and percentage support for a simpler API surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolve conflicts in three generated/test files; all conflicts were mechanical (both branches added new entries to the same files):
- operator/chart/values_partial.gen.go: keep PartialConnectMonitoringConfig alongside the new PartialMulticlusterService struct from main.
- operator/chart/testdata/template-cases.txtar: keep the connect-controller test cases alongside the new multicluster service test cases.
- operator/chart/testdata/template-cases.golden.txtar: regenerated via `go test ./operator/chart/... -run TestTemplate -update-golden` so the chart version labels reflect main's bump to v26.2.1-beta.1.

Also correct CLAUDE.md: chart template tests use `-update-golden`, not `-update`. The TxTar golden helper checks `goldenfile.Update()`, which is gated only by the `-update-golden` flag, so `-update` silently no-ops on these tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The Pipeline controller already validated that the operator-level enterprise
license includes the Connect product, but the rendered Connect pod ran with
the open-source default license. Enterprise inputs like mysql_cdc hit Connect's
own runtime license gate ("this feature requires a valid Redpanda Enterprise
Edition license that includes the Connect product") and crashed, forcing users
to manually wire the license into a `spec.secretRef`.
The renderer now mirrors the operator's license bytes into a Pipeline-owned
Secret (`<pipeline>-license`) and injects `REDPANDA_LICENSE` into the connect
and lint containers via a SecretKeyRef. The Secret is owned by the Pipeline CR
so it GCs cleanly on delete, and lives in the Pipeline's own namespace so no
cross-namespace secret references are needed. RBAC for `secrets` is widened
from `get;list;watch` to include `create;update;patch;delete`.
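The mirrored Secret and env wiring described above might render roughly like this (a sketch; the pipeline name, owner UID, and license bytes are illustrative placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-pipeline-license    # <pipeline>-license; pipeline name hypothetical
  namespace: redpanda               # the Pipeline's own namespace
  ownerReferences:
    - apiVersion: cluster.redpanda.com/v1alpha2
      kind: Pipeline
      name: example-pipeline
      uid: 00000000-0000-0000-0000-000000000000   # illustrative
      controller: true
stringData:
  license: <operator license bytes>
---
# ...and on the connect and lint containers:
# env:
#   - name: REDPANDA_LICENSE
#     valueFrom:
#       secretKeyRef:
#         name: example-pipeline-license
#         key: license
```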
Also fix the PodMonitor CRD-presence check: `errors.Is(err,
&meta.NoKindMatchError{})` was comparing pointer addresses (NoKindMatchError
has no `Is` method), so the intended fast-path never ran. Replaced with
`meta.IsNoMatchError(err)`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end test:
Going to mark this ready for review, given I was able to run an end-to-end test with a few connectors successfully.
The previous commit widened the Pipeline controller's RBAC marker for `secrets` from `get;list;watch` to also include `create;update;patch;delete` so the renderer can manage the per-Pipeline license Secret. The pipeline ClusterRole was regenerated, but the operator chart golden file (which captures the rendered chart output for the connect-controller-enabled, connect-controller-with-license, and connect-monitoring-enabled cases) was not. Regenerated via: go test ./operator/chart/... -run TestTemplate -update-golden Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	acceptance/main_test.go
Reworks the Pipeline spec contract per review feedback. The bag-of-Secret
env-splat pattern is gone; instead the spec is statically typed against the
same Kubernetes-native primitives the other CRDs already use.
Spec changes (operator/api/redpanda/v1alpha2/pipeline_types.go):
- Drop: spec.env ([]corev1.EnvVar), spec.secretRef ([]corev1.LocalObjectReference).
- Add: spec.valueSources ([]NamedValueSource) — one typed env-var
projection per entry, sourced from inline / configMapKeyRef /
secretKeyRef / externalSecretRef (same ValueSource primitive the
User CR uses).
- Add: spec.userRef (*PipelineUserRef) — binds the pipeline to a User
CR. The operator reads the referenced User's password Secret +
SASL mechanism and uses User.metadata.name as the SASL username.
The User CR remains user-managed (operator does not auto-create
it) so ACL scoping stays auditable.
- Existing spec.cluster (*ClusterSource) semantics expanded: when set,
the operator generates a top-level `redpanda` block in the
rendered connect.yaml (seed_brokers, tls.root_cas_file, sasl)
so user YAML never has to hardcode brokers/TLS/SASL. User-side
`redpanda` keys merge on top of the generated block.
- CEL on PipelineSpec enforces the dichotomy: userRef required with
cluster.clusterRef, forbidden with cluster.staticConfiguration,
forbidden without cluster.clusterRef.
Controller changes (operator/internal/controller/pipeline/):
- cluster.go: new resolveUserRef() + userCredentials type.
- render.go: drop EnvFrom on lint+connect containers; replace with
buildValueSourceEnv (per-key typed env). New renderConnectYAML
parses spec.configYaml, injects the generated `redpanda` block
(user keys win on merge), and re-emits.
- controller.go: resolve userRef alongside clusterRef; surface
PipelineConditionUserRef / PipelineReasonUserResolved or
PipelineReasonUserInvalid; pass userCredentials into the
renderer.
Test (controller_test.go):
- TestRender_Deployment_SecretRef → TestRender_Deployment_ValueSources,
asserts envFrom is empty and that each ValueSource entry projects as a
single typed EnvVar.
- TestReconcile_InvalidClusterRefCleansUpManagedResources: add a stub
userRef so the CEL-validated spec is accepted; cluster resolution
still surfaces ClusterRefInvalid first.
Regenerated:
- zz_generated.deepcopy.go, CRD YAML, crd-docs.adoc, chart template
golden, applyconfiguration helpers.
Lint clean (helm lint + golangci-lint + actionlint).
Package tests pass: go test ./operator/internal/controller/pipeline/.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end test: EKS + RDS, new spec shape, Redpanda Connect v4.92.0

Re-ran the Stack
Result: ✅ Pass
What the new spec actually buys you

The CR for the pipeline has no inline brokers / TLS / SASL credentials — the operator generates the top-level `redpanda` block.

Pipeline CR (excerpt):

apiVersion: cluster.redpanda.com/v1alpha2
kind: Pipeline
metadata: { name: mysql-cdc-orders, namespace: redpanda }
spec:
# No `image:` here — the chart-level default `connectController.image`
# set during `helm install` (step 5 below) pins this Pipeline (and every
# other Pipeline in the cluster) to redpandadata/connect:4.92.0.
# Per-Pipeline `.spec.image` still wins if it's set; see
# `TestRender_Deployment_ImagePrecedence` for the three-tier contract.
cluster:
clusterRef: { name: redpanda }
userRef:
name: mysql-cdc-orders-svc
valueSources:
- name: MYSQL_PASSWORD
source:
secretKeyRef: { name: mysql-cdc-creds, key: password }
- name: MYSQL_HOST
source:
secretKeyRef: { name: mysql-cdc-creds, key: host }
- name: MYSQL_USER
source:
secretKeyRef: { name: mysql-cdc-creds, key: username }
configYaml: |
input:
mysql_cdc:
dsn: "${MYSQL_USER}:${MYSQL_PASSWORD}@tcp(${MYSQL_HOST}:3306)/shop"
tables: [orders]
stream_snapshot: true
flavor: mysql
checkpoint_cache: mysql_cdc_orders_checkpoint
checkpoint_key: mysql_cdc_orders_checkpoint
checkpoint_limit: 1024
pipeline:
processors:
- mapping: |
root = this
root.cdc_received_at = now()
output:
redpanda_common: # ← shared-client output; uses the operator-generated `redpanda:` block
topic: "mysql.shop.orders"
key: '${! @table }'
cache_resources:
- label: mysql_cdc_orders_checkpoint
    memory: { default_ttl: 1h }

Rendered:

cache_resources:
- label: mysql_cdc_orders_checkpoint
memory:
default_ttl: 1h
input:
mysql_cdc:
checkpoint_cache: mysql_cdc_orders_checkpoint
checkpoint_key: mysql_cdc_orders_checkpoint
checkpoint_limit: 1024
dsn: ${MYSQL_USER}:${MYSQL_PASSWORD}@tcp(${MYSQL_HOST}:3306)/shop
flavor: mysql
snapshot_max_batch_size: 100
stream_snapshot: true
tables:
- orders
output:
redpanda_common:
key: ${! @table }
topic: mysql.shop.orders
pipeline:
processors:
- mapping: |
root = this
root.cdc_received_at = now()
redpanda: # ← generated from cluster.clusterRef + userRef
sasl:
- mechanism: SCRAM-SHA-512
password: ${REDPANDA_SASL_PASSWORD} # injected env from User CR's password Secret
username: ${REDPANDA_SASL_USERNAME} # injected env (literal: "mysql-cdc-orders-svc")
seed_brokers:
    - redpanda-0.redpanda.redpanda.svc.cluster.local.:9093

Pod env vars projected by the operator:
The Pipeline CR contains zero plaintext credentials.

Reproduction steps

1. Prerequisites
2. Terraform (AWS infra)
cd pr1337-eks-rds/terraform
terraform init
terraform plan -out tfplan
terraform apply tfplan

3. Build + push the operator image

cd <redpanda-operator repo at e7a2da66>
BUILD_GOOS=linux BUILD_GOARCH=amd64 go build -C ./operator \
-o ../.build/redpanda-operator-linux-amd64 \
-ldflags '...' ./cmd/main.go
BUILD_GOOS=linux GOARCH=amd64 go build -C ./alias \
-o ../.build/alias-linux-amd64 \
-ldflags '-X "main.AliasTo=/redpanda-operator run"' .
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin \
605419575229.dkr.ecr.us-east-2.amazonaws.com
docker buildx build --platform linux/amd64 --provenance false --sbom false \
--file operator/Dockerfile --target=manager \
--tag 605419575229.dkr.ecr.us-east-2.amazonaws.com/redpanda-operator-pr1337:e7a2da66 \
  --push .build

4. cert-manager + ESO + EBS CSI

aws eks update-kubeconfig --region us-east-2 --name rp-pr1337-eks-rds
kubectl create namespace redpanda
# cert-manager
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.17.2/cert-manager.yaml
kubectl -n cert-manager wait --for=condition=Available deploy --all --timeout=5m
# ESO with IRSA
helm repo add external-secrets https://charts.external-secrets.io && helm repo update
helm upgrade --install external-secrets external-secrets/external-secrets \
--namespace external-secrets --create-namespace \
--set installCRDs=true \
--set 'serviceAccount.annotations.eks\.amazonaws\.com/role-arn=arn:aws:iam::605419575229:role/rp-pr1337-eks-rds-eso' \
--wait
# EBS CSI driver — EKS 1.31 doesn't auto-install it
aws iam attach-role-policy --role-name <node-role> \
--policy-arn arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy
aws eks create-addon --cluster-name rp-pr1337-eks-rds --region us-east-2 \
--addon-name aws-ebs-csi-driver --resolve-conflicts OVERWRITE
kubectl patch storageclass gp2 -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

5. Operator (PR 1337 chart)

kubectl -n redpanda create secret generic redpanda-license \
--from-file=license=/path/to/redpanda.license
helm upgrade --install redpanda-operator <pr-1337 repo>/operator/chart \
--namespace redpanda \
--set image.repository=605419575229.dkr.ecr.us-east-2.amazonaws.com/redpanda-operator-pr1337 \
--set image.tag=e7a2da66 \
--set image.pullPolicy=IfNotPresent \
--set rbac.createAdditionalControllerCRs=false \
--set connectController.enabled=true \
# NEW chart value (commit 002fd26f): chart-level default Connect image.
# Every Pipeline CR that doesn't pin its own .spec.image inherits this,
# so the Connect runtime version is controlled at install time rather
# than per-CR. Per-Pipeline .spec.image still wins; if neither is set,
# the operator falls back to the PipelineDefaultImage constant baked
# into the binary.
--set connectController.image.repository=docker.redpanda.com/redpandadata/connect \
--set connectController.image.tag=4.92.0 \
--set enterprise.licenseSecretRef.name=redpanda-license \
--set enterprise.licenseSecretRef.key=license \
--set crds.enabled=true --set crds.experimental=true \
  --wait

6. Redpanda CR (SASL on, 1 broker)

kubectl -n redpanda create secret generic redpanda-bootstrap-user \
--from-literal=password=$(openssl rand -hex 16)
kubectl -n redpanda create secret generic users --from-literal=placeholder=ignored
# 01-redpanda.yaml: clusterSpec.auth.sasl.enabled=true, bootstrapUser→redpanda-bootstrap-user,
# users[]: empty (users come from User CR), tls disabled for simplicity.
kubectl apply -f manifests/01-redpanda.yaml
kubectl -n redpanda wait redpanda/redpanda --for=condition=Ready --timeout=10m

7. ESO sync of the RDS password

# 02-eso.yaml: ClusterSecretStore + ExternalSecret pulling rp-pr1337-eks-rds/cdc-user from
# AWS Secrets Manager into a K8s Secret "mysql-cdc-creds" with keys {password, host, username}.
sed -e 's|REPLACE_REGION|us-east-2|g' \
-e 's|REPLACE_CDC_SECRET_NAME|rp-pr1337-eks-rds/cdc-user|g' \
manifests/02-eso.yaml | kubectl apply -f -
kubectl -n redpanda wait --for=condition=Ready externalsecret/mysql-cdc-creds --timeout=2m

8. Bootstrap RDS MySQL

kubectl apply -f manifests/05-mysql-bootstrap-job.yaml
kubectl -n redpanda wait --for=condition=Complete job/mysql-bootstrap --timeout=5m
# RDS doesn't grant RELOAD; grant the MySQL 8.0 dynamic privilege FLUSH_TABLES + LOCK TABLES instead
kubectl apply -f manifests/06-grant-flush-tables.yaml
kubectl apply -f manifests/07-grant-lock-tables.yaml

9. User CR + Pipeline CR

# 03-pipeline-user.yaml: User CR scoped to write topic mysql.shop.orders + Group prefix "mysql-cdc-orders"
# 04-pipeline.yaml: Pipeline CR with cluster.clusterRef + userRef + valueSources (see above)
kubectl apply -f manifests/03-pipeline-user.yaml
kubectl -n redpanda wait --for=condition=Synced user/mysql-cdc-orders-svc --timeout=2m
kubectl apply -f manifests/04-pipeline.yaml
kubectl -n redpanda wait --for=condition=Ready pipeline/mysql-cdc-orders --timeout=5m

10. Verify CDC

PASS=$(kubectl -n redpanda get secret redpanda-bootstrap-user -o jsonpath='{.data.password}' | base64 -d)
# 5 snapshot rows
kubectl -n redpanda exec redpanda-0 -c redpanda -- \
rpk topic consume mysql.shop.orders -n 5 --offset start \
--user kubernetes-controller --password "$PASS" \
--sasl-mechanism SCRAM-SHA-256 \
--brokers redpanda-0.redpanda.redpanda.svc.cluster.local.:9093
# Live insert
kubectl apply -f - <<EOF2
apiVersion: batch/v1
kind: Job
metadata: { name: mysql-insert-frank, namespace: redpanda }
spec:
template:
spec:
restartPolicy: OnFailure
containers:
- name: ins
image: mysql:8.0
env:
- { name: MYSQL_HOST, valueFrom: { secretKeyRef: { name: rds-master-bootstrap, key: host } } }
- { name: MYSQL_USER, valueFrom: { secretKeyRef: { name: rds-master-bootstrap, key: username } } }
- { name: MYSQL_PWD, valueFrom: { secretKeyRef: { name: rds-master-bootstrap, key: password } } }
command: ["mysql","-h","\$(MYSQL_HOST)","-u","\$(MYSQL_USER)","-e","USE shop; INSERT INTO orders (customer,product,qty) VALUES ('frank','gear',11);"]
EOF2
# Verify frank appears
kubectl -n redpanda exec redpanda-0 -c redpanda -- \
rpk topic consume mysql.shop.orders -n 1 --offset 10 \
--user kubernetes-controller --password "$PASS" \
--sasl-mechanism SCRAM-SHA-256 \
  --brokers redpanda-0.redpanda.redpanda.svc.cluster.local.:9093

Notes / friction points worth knowing about
Update 2026-05-16 — Re-test on EKS 1.34 with native RDS IAM database authentication

Re-ran the e2e against EKS 1.34 using native RDS IAM database authentication.

Result: ✅ Pass with native IAM auth
New / corrected friction points

N1 — Recommended redesign (inline-merge approach): the operator should scan the user-supplied configYaml for `input.redpanda` / `output.redpanda` blocks and inline-merge the connection fields into them.

A working example of what the rendered config should look like (after the inline-merge):

output:
redpanda:
seed_brokers:
- redpanda-0.redpanda.redpanda.svc.cluster.local.:9093
sasl:
- mechanism: SCRAM-SHA-512
username: ${REDPANDA_SASL_USERNAME}
password: ${REDPANDA_SASL_PASSWORD}
topic: "mysql.shop.orders"
    key: '${! @table }'

The user authored:

output:
redpanda:
topic: "mysql.shop.orders"
key: '${! @table }'N6 (CORRECTED) — Native RDS IAM auth works today; no Pipeline CR primitive is required. Suggested follow-up: expose N7 (NEW) — N8 (NEW) — IAM auth requires N9 (NEW) — RDS bootstrap SQL: Stack (this run)
Pipeline CR (this run; uses
Two changes bundled to address two distinct review/CI signals:
1) Add `connectController.image.{repository,tag}` to the operator chart.
New chart value plumbs through as `--connect-default-image` on the
operator Deployment command. Pipeline.GetImage()'s precedence becomes:
a. Pipeline.spec.image (per-Pipeline override, still highest)
b. --connect-default-image (chart-level default, NEW)
c. PipelineDefaultImage constant (binary-baked fallback)
No CRD change. Existing Pipeline CRs continue to work unchanged. The
chart value lets operators standardize the Connect runtime version
across every pipeline without each Pipeline author remembering to set
.spec.image. Unit-tested via TestRender_Deployment_ImagePrecedence
(3 subtests, one per precedence tier).
2) Relax the "userRef required when cluster.clusterRef is set" CEL.
That rule rejected the existing pipeline-crds.feature acceptance
scenarios, which use clusterRef on an unauthenticated cluster (the
test's `basic` cluster has SASL disabled). Drops the strict rule;
keeps the two genuinely-correctness-defending rules:
- userRef must NOT be set alongside staticConfiguration
- userRef must NOT be set without clusterRef
UserRef is now an opt-in for SASL-enabled clusters. Updated the
docstring on UserRef to spell out the new semantics; reverted the
   stub-userRef workaround in TestReconcile_InvalidClusterRefCleansUpManagedResources.
Pipeline render layer is already correct: the auto-generated `redpanda:`
block's `sasl:` section is gated on `r.userCredentials != nil`, so a
clusterRef-only Pipeline gets a SASL-less redpanda block (brokers + TLS
only) — exactly what the acceptance tests want.
Verified locally:
go test ./operator/internal/controller/pipeline/ -timeout 5m → ok
go test ./operator/chart/ → ok
task lint → clean
The remaining acceptance failures in build #13661 (migration,
console-upgrades, operator-upgrades, *-crds vectorized variants) are
pre-existing cert-manager-webhook-race infra flakes — not PR 1337's
doing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…o output.redpanda

Two related design changes on top of the v2 Pipeline spec:
- Add Pipeline.spec.serviceAccountName so the IRSA / Workload Identity trust boundary can be scoped per-pipeline instead of relying on the namespace's default ServiceAccount. The operator does not create the SA; users provision it with cloud-IAM annotations out-of-band.
- Drop the auto-generated top-level `redpanda:` block (which only fed the deprecated `redpanda_common` plugin). When the Pipeline binds to a cluster, the operator now inline-merges seed_brokers, tls, and sasl into any input.redpanda / output.redpanda blocks in the user's configYaml. Users write topic / key / consumer_group; the operator fills in the connection plumbing. User-supplied keys still win on conflict.

Tests cover the four merge cases (output, input, user-wins, no-plugin), guard against the redpanda_common block re-appearing, verify the fully-inline pipeline still passes through untouched, and exercise the new serviceAccountName field both set and unset.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…" keyword The ConfigYAML docstring described `redpanda_common` as "deprecated" and `redpanda` as "non-deprecated". The `rp-controller-gen deprecations` generator scans v1alpha2 godoc for the word "deprecated" and adds any hit to zz_generated.deprecations_test.go's TODO list. The build's ci:lint step (which runs `task generate` and then `git diff --exit-code`) failed because CI's regeneration added Pipeline.ConfigYAML to that TODO list while the committed file didn't have it. Rephrased to "the merge targets the `redpanda` input/output plugins specifically; `redpanda_common` blocks are passed through unchanged." Same semantics, no trigger word. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary

Adds a Pipeline CRD (cluster.redpanda.com/v1alpha2) that manages Redpanda Connect pipelines as first-class Kubernetes resources. The spec is statically typed against the same Kubernetes-native primitives the rest of the v1alpha2 CRDs already use:

`cluster` (*ClusterSource) — same primitive Topic/User use. Either point at a Redpanda CR (clusterRef) or supply brokers/TLS/SASL inline (staticConfiguration). When set, the operator inline-merges seed_brokers, tls, and sasl into any input.redpanda / output.redpanda blocks the user wrote in configYaml. The non-deprecated redpanda plugin family is the merge target; the legacy redpanda_common is not auto-configured.

`userRef` — optional alongside cluster.clusterRef. Binds the pipeline to a User CR; the operator reads the User's Secret-backed password + SASL mechanism and uses User.metadata.name as the SASL username. The User CR stays user-managed so ACL scoping is auditable (the operator does not auto-create or modify it). Omit for unauthenticated clusters.

Why `userRef` is flat while `cluster.clusterRef` is wrapped: `cluster` is a ClusterSource — a discriminated union of two sources: point at a Redpanda CR (clusterRef: { name }) OR supply brokers/TLS/SASL inline (staticConfiguration: {...}). The wrapping exists to express that union, and it matches the existing Topic/User CRDs (spec.cluster: *ClusterSource). `userRef` has only one source today — point at a User CR by name — so it stays flat: userRef: { name }. The inline-SASL counterpart already lives at cluster.staticConfiguration.kafka.sasl.{mechanism,username,password} (also flat — also a single source), and CEL forbids combining it with userRef. If a future use case needs a second user-identity source (e.g., inline SASL without a User CR backing it), we can promote userRef into a user: UserSource wrapper at that point.

`serviceAccountName` — the ServiceAccount bound to the pipeline pod. When unset, the namespace's default SA is used. Set this to scope cloud-IAM trust (IRSA on EKS, Workload Identity on GKE, Pod Identity on AKS) per-pipeline rather than sharing the namespace's default SA across every pipeline. The operator does not create the SA — provision it (with the cloud-IAM annotations) out-of-band.

`valueSources` — typed list of named env-var projections (one named pull per entry) backed by inline / configMapKeyRef / secretKeyRef / externalSecretRef. Replaces the earlier secretRef[] env-splat and env[] raw corev1.EnvVar approaches.

`image` — per-pipeline Connect runtime override. Three-tier precedence: Pipeline.spec.image wins, then the chart-level connectController.image.{repository,tag} default (plumbed in via the operator's --connect-default-image flag), then the binary-baked PipelineDefaultImage constant.

`configYaml` — the user's Connect pipeline YAML. Stays the inline catch-all for anything the typed fields don't cover.

CEL on PipelineSpec enforces the contract: userRef is forbidden alongside cluster.staticConfiguration (the static path carries its own inline SASL), and userRef is forbidden without cluster.clusterRef (no cluster context to authenticate against otherwise). userRef is otherwise an opt-in for SASL-enabled clusters.

Worked examples
A. Cluster-bound — Pipeline points at a Redpanda CR on the same Kubernetes cluster
The user provisions a SCRAM User CR with ACLs scoped to what the pipeline reads/writes; the Pipeline then references both the cluster and the user.

User CR (separate manifest, owns the SCRAM identity + ACLs):
Pipeline CR:
What the operator renders into the pod's /config/connect.yaml:

Pod env (auto-derived):
- REDPANDA_SASL_USERNAME="orders-to-warehouse" (literal, from User.metadata.name)
- REDPANDA_SASL_MECHANISM="SCRAM-SHA-512" (literal, from User.spec.authentication.type)
- REDPANDA_SASL_PASSWORD ← secretKeyRef: { orders-to-warehouse-password, password } (from User.spec.authentication.password.valueFrom.secretKeyRef)
- S3_SECRET_KEY ← secretKeyRef: { s3-creds, secret_access_key } (from valueSources)

User keys win on conflict. If the user had written seed_brokers: [external.example.com:9093] inside input.redpanda, the operator would have left that value untouched and only injected the missing tls and sasl keys. That's the escape hatch for cluster-bound pipelines that need to point a specific input/output at a different cluster.

B. External Kafka / BYOC — static configuration
For pipelines reaching an external Redpanda or Kafka the operator doesn't run (Redpanda Cloud BYOC, cross-region tap, Confluent Cloud, MSK, etc.). No `userRef`; SASL credentials live inline on the static config and the password is itself a `ValueSource`.

Pipeline CR:
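A sketch (API group/version and all endpoint values are illustrative; the `staticConfiguration` field names follow the existing `Topic`/`User` `ClusterSource` shape):

```yaml
apiVersion: cluster.redpanda.com/v1alpha3   # hypothetical group/version
kind: Pipeline
metadata:
  name: byoc-tap
  namespace: redpanda
spec:
  cluster:
    staticConfiguration:
      kafka:
        brokers:
          - seed-0.byoc.example.com:9092    # illustrative external brokers
        tls: {}
        sasl:
          mechanism: SCRAM-SHA-256
          username: tap-reader
          password:
            valueFrom:
              secretKeyRef:                 # the password is itself a ValueSource-style pull
                name: byoc-credentials
                key: password
  configYaml: |
    input:
      redpanda:
        topics: [events]
    output:
      stdout: {}
```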
What the operator renders:
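The merge that produces this render can be sketched in Go. This is a hypothetical helper illustrating the documented contract (inject only the binding-derived keys that are missing from a `*.redpanda` block; user-supplied keys win), not the operator's actual code:

```go
package main

import "fmt"

// mergeMissing injects binding-derived connection fields into a user-supplied
// input.redpanda / output.redpanda block, never overwriting user keys.
// Sketch only: the real operator works on decoded YAML nodes.
func mergeMissing(user, derived map[string]any) map[string]any {
	out := map[string]any{}
	for k, v := range user {
		out[k] = v
	}
	for k, v := range derived {
		if _, exists := out[k]; !exists { // user keys win on conflict
			out[k] = v
		}
	}
	return out
}

func main() {
	// The user pinned seed_brokers to an external cluster; the operator
	// only injects the missing tls and sasl keys.
	user := map[string]any{
		"seed_brokers": []string{"external.example.com:9093"},
		"topics":       []string{"orders"},
	}
	derived := map[string]any{
		"seed_brokers": []string{"redpanda.redpanda.svc:9093"},
		"tls":          map[string]any{"enabled": true},
		"sasl":         map[string]any{"mechanism": "SCRAM-SHA-512"},
	}
	merged := mergeMissing(user, derived)
	fmt.Println(merged["seed_brokers"]) // → [external.example.com:9093]
}
```

The same contract backs both this example and the `seed_brokers` escape hatch in example A: user-pinned connection keys survive, and only the missing pieces are injected.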
`staticConfiguration` and `clusterRef` produce the same inline-merge contract — only the source of truth for the connection fields differs. No `User` CR involved.

#### C. Per-pipeline IRSA — native RDS IAM database authentication via `serviceAccountName`

Pipeline binds to a Redpanda cluster for its output and to RDS for its CDC input. The pipeline pod itself calls AWS APIs (`rds:GenerateDBAuthToken`) using an IAM role assumed via IRSA. `serviceAccountName` scopes that trust to this one Pipeline — no other workload in the `redpanda` namespace can assume the role.

Out-of-band: ServiceAccount with the IRSA annotation (terraform / pulumi / a separate manifest — the operator does not create it):
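A sketch of that ServiceAccount (the SA name and AWS account ID are illustrative; `mysql-cdc-orders-rds` is the role the pipeline assumes below):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: mysql-cdc-orders        # illustrative SA name
  namespace: redpanda
  annotations:
    # IRSA: EKS projects a token for this SA; the role's trust policy
    # is scoped to exactly this namespace/name subject.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/mysql-cdc-orders-rds
```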
Pipeline CR:
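A sketch (API group/version, SA name, DSN, and the CDC plugin config are illustrative; the grounded pieces are the `serviceAccountName` scoping, the `clusterRef` + `userRef` binding, and the `mysql.shop.orders` output topic):

```yaml
apiVersion: cluster.redpanda.com/v1alpha3   # hypothetical group/version
kind: Pipeline
metadata:
  name: mysql-cdc-orders
  namespace: redpanda
spec:
  serviceAccountName: mysql-cdc-orders      # SA carrying the IRSA annotation
  cluster:
    clusterRef:
      name: redpanda
  userRef:
    name: mysql-cdc-orders                  # User CR gating the output topic
  configYaml: |
    input:
      mysql_cdc:                            # illustrative plugin config
        dsn: iam-user@tcp(orders.abc123.us-east-1.rds.amazonaws.com:3306)/shop
        tables: [orders]
    output:
      redpanda:
        topic: mysql.shop.orders
```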
The pipeline pod assumes `mysql-cdc-orders-rds` (and only that role) via the projected service-account token at `/var/run/secrets/eks.amazonaws.com/serviceaccount/token`. The MySQL connection uses an IAM-generated token; no MySQL password lives anywhere in the pod, the Pipeline CR, or a Secret.

This is the production K8s-RDS pattern: IRSA gates AWS API access (so the pipeline can mint MySQL tokens); Pipeline `userRef` gates Redpanda access (so the pipeline can write its output topic). The two trust boundaries are orthogonal.

#### D. Inline — pipeline references multiple external sources via `valueSources`

For pipelines whose primary connection isn't Redpanda at all, or that fan out to multiple non-Kafka backends.
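A sketch of such a pipeline, mixing all four `valueSources` kinds (names, plugins, and the `externalSecretRef` shape are illustrative):

```yaml
apiVersion: cluster.redpanda.com/v1alpha3   # hypothetical group/version
kind: Pipeline
metadata:
  name: mongo-to-warehouse
  namespace: redpanda
spec:
  valueSources:
    - name: ENVIRONMENT
      inline: production                    # literal value
    - name: MONGO_URI
      secretKeyRef:
        name: mongo-creds
        key: uri
    - name: WAREHOUSE_DSN
      configMapKeyRef:
        name: warehouse-config
        key: dsn
    - name: WAREHOUSE_PASSWORD
      externalSecretRef:
        name: warehouse-password            # materialized via the external secret store
  configYaml: |
    input:
      mongodb:                              # illustrative plugin config
        url: ${MONGO_URI}
        database: app
        collection: events
    output:
      sql_insert:
        driver: mysql
        dsn: ${WAREHOUSE_DSN}
        table: events
        columns: [doc]
        args_mapping: 'root = [ content().string() ]'
```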
`valueSources` is destination-agnostic: each entry is a typed pull from inline / Secret / ConfigMap / ExternalSecret and projects to an env var the YAML references via `${NAME}`. Connect plugins read those env vars however they expose credentials.

Properties this design intentionally preserves:

- The operator only touches `input.redpanda` and `output.redpanda` blocks. New Connect plugins ship in future Connect releases without any operator change; non-`redpanda` blocks (`mongodb`, `snowflake_put`, `sql_insert`, `aws_s3`, `mysql_cdc`, etc.) pass through untouched.
- `inline` / `secretKeyRef` / `configMapKeyRef` / `externalSecretRef` can mix freely across entries in the same `valueSources` list.
- Unlike the earlier `secretRef[]` env-splat, every value is a named pull. Unused keys in a Secret don't leak into the pod env, the env name and the Secret's key can differ, and multiple pipelines can pull non-overlapping keys from the same Secret.
- `cluster` + `userRef` gates Redpanda access; `serviceAccountName` gates cloud-IAM access. Either can be set without the other.

### Status conditions
| Condition | `True` when | `False` reason |
| --- | --- | --- |
| `ClusterRef` | `cluster.clusterRef` resolved → broker list + TLS material loaded | `ClusterRefInvalid` (cluster not found / not Ready) |
| `UserRef` | `userRef.name` exists, has `password.valueFrom.secretKeyRef` set, mechanism resolved | `UserInvalid` (User CR not found, missing Secret-backed password) |
| `ConfigValid` | `redpanda-connect lint` passes | `ConfigInvalid` |
| `Ready` | | |

### Tests
- `TestRender_InlineMergesRedpandaPlugins` — six subtests covering the cluster-binding render path: merges into `output.redpanda`; merges into `input.redpanda`; user-supplied keys win on conflict; no `*.redpanda` block in user config → no injection; `output.redpanda_common` is intentionally not auto-configured (regression guard against re-emitting a top-level `redpanda:` block); fully-inline pipeline (no cluster binding) passes through unchanged.
- `TestRender_Deployment_ServiceAccountName` — propagation of `spec.serviceAccountName` to `Deployment.Spec.Template.Spec.ServiceAccountName`; empty when unset.
- `TestRender_Deployment_ImagePrecedence` — three subtests, one per image-precedence tier (per-pipeline `spec.image` > chart-level `connectController.image` > binary constant).
- `TestRender_Deployment_ValueSources` — asserts `EnvFrom` is empty on both the lint init and the connect container, and that each `ValueSource` entry projects as exactly one typed `EnvVar` (inline → `Value`; `secretKeyRef`/`configMapKeyRef` → `ValueFrom.{Secret,ConfigMap}KeyRef`).
- `TestReconcile_InvalidClusterRefCleansUpManagedResources` — `cluster.clusterRef` resolution failure short-circuits before user resolution; status surfaces `ClusterRefInvalid` and managed resources are torn down.
- Existing `TestRender_*` cases cover Replicas, Paused, Zones, Resources, Annotations, Topology, Budget, ConfigFiles, MonitoringPodMonitor.
- `task lint` (helm lint + golangci-lint + actionlint) clean.
- `task generate` clean (no further diff).

### End-to-end validation
The branch was exercised against real AWS infrastructure (EKS 1.34 + RDS MySQL 8.0 with `iam_database_authentication_enabled=true`) using the Example C scenario (per-pipeline IRSA + native RDS IAM auth + Pipeline CR writing to Redpanda). MySQL snapshot rows reached the `mysql.shop.orders` topic with no MySQL password anywhere in the Pipeline CR, the pod env, or any Secret. See the e2e comment thread for the full run + reproduction steps.